File system rollback to previous point in time

ABSTRACT

A technique for performing continuous data protection and point-in-time recovery for file systems includes performing continuous replication to maintain a replica of a file system by writing changes in the file system to a journal and then writing the changes from the journal to the replica. In response to receiving a request to roll back the replica to a previous point in time, the technique accesses the journal to identify changes made to the replica since the previous point in time and performs undo operations to undo the identified changes and restore the replica to its state at the previous point in time.

BACKGROUND

Data storage systems commonly employ continuous data protection (CDP), also known as “continuous replication,” for protecting the data they store. Continuous replication operates on storage volumes using Fibre Channel or iSCSI (Internet Small Computer System Interface), for example, to replicate data writes performed on storage volumes at a source to replicas of the storage volumes maintained at a destination. Continuous replication generally allows administrators to perform point-in-time recovery of a volume to a previous state with fine granularity.

A well-known solution for continuous data protection is the RecoverPoint system available from EMC Corporation of Hopkinton, Mass. RecoverPoint systems include a replication splitter and one or more local appliances, both at a source data storage system and at a destination data storage system. As the source processes IO (Input/Output) requests that specify data to be written to a particular volume, the replication splitter at the source intercepts the IO requests and sends them to the local appliance. The appliance at the source communicates with the appliance at the destination, and the two appliances orchestrate the storage of the data specified in the IO requests at the destination. In this manner, the destination is made to store a current, or nearly current, replica of the volume. In addition, journaling of changes made to the replica allows one to achieve point-in-time recovery in the event of a failure at the source or as otherwise desired.

SUMMARY

Although continuous data protection can provide a reliable approach to replicating data and providing point-in-time recovery for storage volumes, it is not an approach that works natively with file systems. Unlike volumes, in which data are addressed using block-based semantics, e.g., by specifying LUNs (logical unit numbers) and offset ranges, data in file systems are generally accessed by specifying directories and file names. In addition, access to data in volumes is generally achieved using Fibre Channel or iSCSI protocols, whereas access to data in file systems is generally achieved using NFS (Network File System), CIFS (Common Internet File System), or SMB (Server Message Block) protocols. Thus, the benefits afforded by continuous data protection are generally not available to file systems.

In contrast with these prior approaches, in which continuous data protection and point-in-time recovery are limited to storage volumes, an improved technique provides continuous data protection and point-in-time recovery for file systems. The technique includes performing continuous replication to maintain a replica of a file system by writing changes in the file system to a journal and then writing the changes from the journal to the replica. In response to receiving a request to roll back the replica to a previous point in time, the improved technique accesses the journal to identify changes made to the replica since the previous point in time and performs undo operations to undo the identified changes and restore the replica to its state at the previous point in time.

In some examples, the replica of the file system is realized as a container file in a container file system in a data storage system. The data storage system includes a mapping layer to expose the container file as a volume. Continuous replication may then operate on the resulting volume-file as it would on any other volume, and thus may perform continuous data protection and point-in-time recovery on the file system.

In some examples, multiple file systems are grouped together in a construct referred to herein as a VSP, or Virtualized Storage Processor, which acts to aggregate multiple file systems under a single object. In some examples, VSPs may include other objects besides file systems, such as LUNs and VVols (virtual volumes), for example. In accordance with embodiments disclosed herein, the improved technique groups together the multiple file systems and/or other objects and performs continuous data protection on those objects as a single unit. The improved technique further enables one to roll back a replica of a VSP, including all of its data objects, consistently to a previous point in time. Thus, in an example, point-in-time recovery is made available for both file systems and VSPs.

In a particular example, recovery of a file system or a VSP to a previous point in time is performed as part of DR (Disaster Recovery) testing. For instance, if a current version of a file system or VSP appears to be corrupted, an administrator can roll back the replica to a previous point in time, e.g., to get behind the corruption. The administrator may then perform DR testing and resume from the previous point in time or from some other point in time.

Certain embodiments are directed to a method of managing file system replicas in a data storage system. The method includes performing continuous replication to maintain a replica of a file system, the continuous replication (i) specifying changes to be made to the file system and mirrored to the replica, (ii) persisting the changes and associated timestamps in a journal, and (iii) applying the changes persisted in the journal to the replica. The method further includes receiving a request to roll back the replica of the file system to a previous point in time and, in response to receiving the request, (i) accessing the journal to identify, based on the timestamps, a set of the changes made to the replica since the previous point in time and (ii) undoing the set of the changes in the replica to restore the replica to the previous point in time.
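By way of illustration only, the journaling and rollback steps of the method may be sketched as follows. The sketch is a simplified model and not the disclosed implementation: the names JournalEntry, ReplicaJournal, record_and_apply, and roll_back are hypothetical, the replica is modeled as an in-memory mapping from block address to data, and undo information is assumed to be captured as the prior contents of each block at the time a change is applied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class JournalEntry:
    """One journaled change (hypothetical layout, for illustration)."""
    timestamp: float            # time associated with the change
    address: int                # block address in the replica
    new_data: bytes             # data written by the change
    old_data: Optional[bytes]   # prior contents (None if block was unwritten)


@dataclass
class ReplicaJournal:
    replica: Dict[int, bytes] = field(default_factory=dict)
    entries: List[JournalEntry] = field(default_factory=list)

    def record_and_apply(self, timestamp: float, address: int, data: bytes) -> None:
        # Persist the change and its timestamp, keeping the prior
        # contents as undo information, then apply it to the replica.
        entry = JournalEntry(timestamp, address, data, self.replica.get(address))
        self.entries.append(entry)
        self.replica[address] = data

    def roll_back(self, point_in_time: float) -> None:
        # Assumes entries are appended in non-decreasing timestamp order.
        # Undo, newest first, every change made after point_in_time.
        while self.entries and self.entries[-1].timestamp > point_in_time:
            entry = self.entries.pop()
            if entry.old_data is None:
                self.replica.pop(entry.address, None)
            else:
                self.replica[entry.address] = entry.old_data
```

In this sketch, changes are undone in reverse order of application, so that a block overwritten several times since the previous point in time is restored to the contents it held at that point.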

Other embodiments are directed to a data storage system including control circuitry constructed and arranged to perform a method of managing file system replicas in a data storage system, such as the method described above. Still other embodiments are directed to a computer program product. The computer program product stores instructions which, when executed by control circuitry, cause the control circuitry to perform a method of managing replicas in a data storage system, such as the method described above. The replicas may be replicas of file systems or replicas of VSPs.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The foregoing and other features and advantages will be apparent from the following description of particular embodiments of the invention, as illustrated in the accompanying drawings, in which like reference characters refer to the same or similar parts throughout the different views. In the accompanying drawings,

FIG. 1 is a block diagram showing an example environment in which improved techniques hereof can be practiced;

FIG. 2 is a block diagram showing an example IO stack of a storage processor of FIG. 1;

FIG. 3 is a block diagram showing example features of the IO stack of FIG. 2 in additional detail;

FIG. 4 is a block diagram showing example records of a configuration database that associates VSPs with data objects;

FIG. 5 is a block diagram showing an example arrangement for performing continuous replication of a VSP between a first data storage system at a first site and a second data storage system at a second site;

FIG. 6 is a block diagram showing example contents of a journal used in performing continuous replication and point-in-time recovery;

FIG. 7 is a block diagram showing an example of pending metadata changes in transaction logs being synced to respective file systems of a VSP when performing point-in-time recovery;

FIGS. 8A and 8B show an example technique for handling data writes to a VSP during DR testing to allow the writes to be undone;

FIG. 9 is a screen shot showing an example screen generated by a graphical user interface (GUI) application for establishing replication settings of data objects on a per-user-object basis and for managing life cycles of user objects;

FIG. 10 is a screen shot showing another example screen generated by the graphical user interface (GUI) application for establishing replication settings for a VSP, for rolling back to a previous point in time, and for entering and exiting a DR testing mode; and

FIG. 11 is a flowchart showing an example method for managing file system replicas in a data storage system.

DETAILED DESCRIPTION OF THE INVENTION

Embodiments of the invention will now be described. It is understood that such embodiments are provided by way of example to illustrate various features and principles of the invention, and that the invention hereof is broader than the specific example embodiments disclosed.

An improved technique provides continuous data protection and point-in-time recovery for file systems and VSPs (Virtualized Storage Processors). This document is presented in sections to assist the reader. In the material that follows,

-   Section I presents an example environment in which improved techniques hereof can be practiced. Section I describes, inter alia, a unified datapath architecture for expressing both block-based objects and file-based objects as respective underlying volumes and underlying files, which enables the use of a common replication approach for both block-based and file-based objects.
-   Section II presents particular example improvements for effecting continuous replication of both block-based and file-based objects on a per-user-object basis under the direction of a common replication manager.
-   Section III presents particular improvements for performing point-in-time recovery of file system and VSP replicas.

I) Example Environment Including Unified Datapath Architecture:

FIG. 1 shows an example environment 100 in which embodiments of the improved technique hereof can be practiced. Here, multiple host computing devices (“hosts”), shown as devices 110(1) through 110(N), access a data storage system 116 over a network 114. The data storage system 116 includes a storage processor, or “SP,” 120 and storage 180. The storage 180 is provided, for example, in the form of hard disk drives and/or electronic flash drives. The data storage system 116 may include multiple SPs like the SP 120 (see, for example, a second SP 120a). For instance, multiple SPs may be provided as circuit board assemblies, or “blades,” which plug into a chassis that encloses and cools the SPs. The chassis has a backplane for interconnecting the SPs, and additional connections may be made among SPs using cables. It is understood, however, that no particular hardware configuration is required, as any number of SPs (including a single one) can be provided and the SP 120 can be any type of computing device capable of processing host IOs.

The network 114 can be any type of network or combination of networks, such as a storage area network (SAN), local area network (LAN), wide area network (WAN), the Internet, and/or some other type of network, for example. In an example, the hosts 110(1-N) connect to the SP 120 using various technologies. For example, the host 110(1) may connect to the SP 120 using Fibre Channel (e.g., through a SAN). The hosts 110(2-N) can connect to the SP 120 using TCP/IP, to support, for example, iSCSI, NFS, SMB 3.0, and CIFS. Any number of hosts 110(1-N) may be provided, using any of the above protocols, some subset thereof, or other protocols besides those shown. As is known, Fibre Channel and iSCSI are block-based protocols, whereas NFS, SMB 3.0, and CIFS are file-based protocols. The SP 120 is configured to receive IO requests 112(1-N) according to both block-based and file-based protocols and to respond to such IO requests 112(1-N) by reading and/or writing the storage 180.

The SP 120 is seen to include one or more communication interfaces 122, a set of processing units 124, and memory 130. The communication interfaces 122 include, for example, adapters such as SCSI target adapters and network interface adapters for converting electronic and/or optical signals received from the network 114 to electronic form for use by the SP 120. The set of processing units 124 includes one or more processing chips and/or assemblies. In a particular example, the set of processing units 124 includes numerous multi-core CPUs. The memory 130 includes both volatile memory (e.g., RAM) and non-volatile memory, such as one or more ROMs, disk drives, solid state drives, and the like. The set of processing units 124 and the memory 130 together form control circuitry, which is constructed and arranged to carry out various methods and functions as described herein. Also, the memory 130 includes a variety of software constructs realized in the form of executable instructions. When the executable instructions are run by the set of processing units 124, the set of processing units 124 is caused to carry out the operations of the software constructs. Although certain software constructs are specifically shown and described, it is understood that the memory 130 typically includes many other software constructs, which are not shown, such as various applications, processes, and daemons.

As shown, the memory 130 includes an operating system 134, such as Unix, Linux, or Windows™, for example. The memory 130 further includes a container 132. In an example, the container 132 is a software process that provides an isolated userspace execution context within the operating system 134. In various examples, the memory 130 may include multiple containers like the container 132, with each container providing its own isolated userspace instance. Although containers provide isolated environments that do not directly interact (and thus promote fault containment), different containers can run on the same kernel (not shown) and can communicate with one another using inter-process communication (IPC) mediated by the kernel. Containers are well-known features of Unix, Linux, and other operating systems.

In the example of FIG. 1, only a single container 132 is shown. Running within the container 132 are an IO stack 140, a replication manager 162, a Graphical User Interface (GUI)-based application 164, and multiple VSPs (Virtualized Storage Processors) VSP1 to VSPN. A VSP is a collection of data objects, internal file systems, and servers (e.g., NFS and/or CIFS servers), which together provide a mechanism for grouping objects and providing a common set of network interfaces such that the VSP appears from outside the SP 120 to be similar to a physical SP. Although certain components are shown residing within the container 132, different components may alternatively reside in different containers. For example, the GUI application 164 may run within a dedicated container and communicate with the replication manager 162 using IPC.

The IO stack 140 provides an execution path for host IOs (e.g., IO requests 112(1-N)) and includes a front end 142 and a back end 144. In alternative arrangements, the back end 144 is located on another SP (e.g., SP 120a) or is provided in a block-based array connected to the SP 120 (e.g., in a gateway configuration).

The replication appliance 160 assists in performing continuous replication to a second data storage system, which may be located locally to the data storage system 116 or remotely. In an example, the replication appliance 160 takes the form of a hardware unit, and multiple such units may be provided, e.g., in a clustered arrangement, such as for supporting strong data compression and other advanced features. For purposes of this document, the replication appliance 160 is referred to as a single component. It should be understood, however, that the replication appliance 160 may be implemented using any number of coordinating units. Continuous replication may also be performed entirely locally, e.g., between a source volume and a destination volume both housed within the data storage system 116. The replication appliance 160 may include a journal 160 for persisting replication data and for performing other functions.

The replication manager 162 orchestrates replication and coordinates with other data storage systems to conduct and manage replication sessions. Here, the replication manager 162 establishes replication settings on a per-data-object basis, conducts replication sessions with replica sites, and controls replication activities, including recovery, failover, and DR testing activities.

The GUI application 164 provides a user interface for configuring the replication manager 162, e.g., for establishing replication settings on particular data objects. In an example, the GUI application 164 further provides user interface controls for creating data objects, destroying data objects, and managing data objects throughout their lifecycles. Particular functions of the GUI application 164 may include, for example, managing VSPs throughout their lifecycles, accessing replicas of VSPs (e.g., locally or on other data storage systems), rolling back VSP replicas to previous points in time, and performing DR testing. In one implementation, the GUI application 164 is a modified form of the Unisphere integrated management tool, available from EMC Corporation of Hopkinton, Mass.

As the IO stack 140, replication manager 162, and GUI application 164 all run within the same container 132, the IO stack 140 and replication manager 162 can communicate with one another using APIs (application program interfaces) and pointer passing and without the need to use IPC.

The memory 130 is further seen to include a configuration database 170. The configuration database 170 stores configuration information pertaining to the data storage system 116, including information about the VSPs 1-N and the data objects with which they are associated. In other implementations, the data storage system 116 stores the configuration database 170 elsewhere, such as in the storage 180, on a disk drive or flash drive separate from the SP 120 but accessible to the SP 120, e.g., over a backplane or network, or in some other location.

In example operation, the hosts 110(1-N) issue IO requests 112(1-N) to the data storage system 116. The IO requests 112(1-N) may include block-based requests and/or file-based requests. The SP 120 receives the IO requests 112(1-N) at the communication interfaces 122 and passes the IO requests to the IO stack 140 for further processing. At the front end 142, processing may include mapping IO requests directed to LUNs, host file systems, VVols (virtual volumes, available from VMware Corporation of Palo Alto, Calif.), and other data objects, to block-based requests presented to internal volumes. Processing in the front end 142 may further include mapping the internal volumes to respective files stored in a set of internal file systems of the data storage system 116. Host IO requests 112(1-N) directed to the SP 120 for reading and writing both block-based objects and file-based objects are thus converted to reads and writes of respective volumes, which are then converted to reads and writes of respective files. As will be described, the front end 142 may perform continuous replication at the level of the internal volumes, where both block-based objects and file-based objects are presented in block-based form. Continuous replication may thus be applied to file systems, as well as to other objects. Also, as will become apparent, continuous replication may further be applied to VSPs, e.g., by forming consistency groups among the file systems or other objects that make up the VSPs.

After processing by the front end 142, the IO requests propagate to the back end 144, where the back end 144 executes commands for reading and/or writing the physical storage 180, agnostically to whether the data read and/or written is directed to a block-based object or to a file-based object.

FIG. 2 shows the front end 142 and back end 144 of the IO stack 140 in additional detail. Here, the front end 142 is seen to include protocol end points 220, a redirector 222, an object-volume mapping layer 224, a replication splitter 226, a volume-file mapping 228, lower-deck (internal) file systems 230, a storage pool 232, and a basic volume interface 236. The back end 144 is seen to include a host side adapter 250, a RAID (Redundant Array of Independent Disks) manager 252, and hard disk drive/electronic flash drive support 254. Although IO requests 112 enter the IO stack 140 from the top and propagate down (from the perspective of FIG. 2), for ease of understanding, the different components of the IO stack 140 are described herein from the bottom up. It is understood that IO requests 112 are internal representations of the IO requests 112(1-N) as shown in FIG. 1.

At the back end 144, the hard disk drive/electronic flash drive support 254 includes drivers that perform the actual reading from and writing to the storage 180. The RAID manager 252 accesses particular storage units (slices) written or read using RAID protocols. The host side adapter 250 provides an interface to the front end 142, for instances in which the front end 142 and back end 144 are run on different machines. When the front end 142 and back end 144 are co-located on the same SP, as they are in FIG. 1, the host side adapter 250 may be omitted or disabled.

Continuing to the front end 142, the basic volume interface 236 provides an interface to the back end 144 for instances in which the front end 142 and back end 144 are run on different hardware. The basic volume interface 236 may also be disabled in the arrangement shown in FIG. 1.

The storage pool 232 organizes elements of the storage 180 in the form of slices. A “slice” is an increment of storage space, such as 256 MB or 1 GB in size, which is derived from the storage 180. The pool 232 may allocate slices to lower-deck file systems 230 for use in storing their files. The pool 232 may also deallocate slices from lower-deck file systems 230 if the storage provided by the slices is no longer required. In an example, the storage pool 232 creates slices by accessing RAID groups formed by the RAID manager 252, expressing the RAID groups as FLUs (Flare LUNs), and dividing the FLUs into slices.
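As a rough sketch of the slice management just described, the following models a pool that divides FLUs into fixed-size slices and allocates and deallocates them. The class name StoragePool, its methods, and the representation of a slice as a (FLU index, slice index) pair are assumptions made for illustration; the 256 MB slice size is one of the example sizes given above.

```python
SLICE_SIZE_MB = 256  # example slice size; 1 GB is another size mentioned above


class StoragePool:
    """Hypothetical model of a pool that carves FLUs into slices."""

    def __init__(self, flu_sizes_mb):
        # Divide each FLU into whole slices; any remainder smaller
        # than one slice is left unused in this simplified model.
        self.free = []
        for flu_id, size_mb in enumerate(flu_sizes_mb):
            for index in range(size_mb // SLICE_SIZE_MB):
                self.free.append((flu_id, index))
        self.allocated = {}  # slice -> name of owning lower-deck file system

    def allocate(self, fs_name):
        # Hand the next free slice to a lower-deck file system.
        if not self.free:
            raise RuntimeError("storage pool exhausted")
        slice_id = self.free.pop(0)
        self.allocated[slice_id] = fs_name
        return slice_id

    def deallocate(self, slice_id):
        # Return a slice to the pool once it is no longer required.
        del self.allocated[slice_id]
        self.free.append(slice_id)
```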

The lower-deck file systems 230 are built upon slices managed by the storage pool 232 and represent both block-based objects and file-based objects internally in the form of files (e.g., container files). The data storage system 116 may host any number of lower-deck file systems 230, and each lower-deck file system may include any number of files. In a typical arrangement, a different lower-deck file system is provided for each data object to be stored. Each lower-deck file system includes one file that stores the data object itself (the primary object) and, in some instances, other files that store snaps of the file that stores the primary object. Some implementations may provide for storage of other files, such as auxiliary files, which support respective primary files. An example of an auxiliary file is a hybrid log, which stores pending metadata transactions directed to a primary object stored as a file in the same lower-deck file system. Each lower-deck file system 230 has an inode table. The inode table provides a different inode for each file stored in the respective lower-deck file system. The inode table may also store properties of the file(s), such as their ownership and block locations at which file data are stored.

The volume-file mapping 228 maps each file representing a data object to a respective volume, which is accessible using block-based semantics. The volume-file mapping can be achieved in a variety of ways. According to one example, a file representing a data object is regarded as a range of blocks (e.g., 8K allocation units), and the range of blocks can be expressed as a corresponding range of offsets into the file. Because volumes are accessed based on starting locations (logical unit number) and offsets, the volume-file mapping 228 can establish a one-to-one correspondence between offsets into the file and offsets into the corresponding internal volume, thereby providing the requisite mapping needed to express the file in the form of a volume.
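The one-to-one correspondence between volume offsets and file offsets may be illustrated with a minimal sketch. The class VolumeFileMapping and its methods are hypothetical names, the container file is modeled as an in-memory byte array, and the block size follows the 8K allocation unit given as an example above.

```python
BLOCK_SIZE = 8 * 1024  # 8K allocation units, per the example above


class VolumeFileMapping:
    """Hypothetical sketch: exposes a container file as a block volume."""

    def __init__(self, size_in_blocks):
        # The container file is modeled here as a zero-filled byte array.
        self.file = bytearray(size_in_blocks * BLOCK_SIZE)

    def file_offset(self, block_number):
        # One-to-one mapping: block N of the volume begins at byte
        # offset N * BLOCK_SIZE of the file.
        offset = block_number * BLOCK_SIZE
        if not 0 <= offset < len(self.file):
            raise ValueError("block number out of range")
        return offset

    def write_block(self, block_number, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        start = self.file_offset(block_number)
        self.file[start:start + BLOCK_SIZE] = data

    def read_block(self, block_number):
        start = self.file_offset(block_number)
        return bytes(self.file[start:start + BLOCK_SIZE])
```

Because the mapping in this sketch is an identity on offsets, any layer that operates on the volume, such as a replication splitter, need not be aware that the volume is backed by a file.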

The replication splitter 226 sits above the volume-file mapping 228. The replication splitter 226 is configurable by the replication manager 162 on a per-data-object basis to intercept IO requests and to replicate (e.g., mirror) the data specified to be written in such requests according to data-object-specific settings. Depending on the data object to which the IO request is directed and the replication settings defined for that object, the replication splitter 226 may allow IO requests it receives to pass through to the volume-file mapping 228 unimpeded (e.g., if no replication is specified for that data object). Alternatively, the replication splitter 226 may intercept the IO request, forward the request to the replication appliance 160, and hold the request until the replication splitter 226 receives an acknowledgement back from the replication appliance 160. Once the acknowledgement is received, the replication splitter 226 may allow the IO request to continue propagating down the IO stack 140. It should be understood that the replication manager 162 can configure the replication splitter 226 in a variety of ways for responding to different types of IO requests 112. For example, the replication manager 162 can configure the replication splitter 226 to operate in a pass-through mode for control IOs and for IO requests specifying data reads. In some situations, the replication manager 162 can configure the replication splitter 226 to intercept reads as well as writes. In any such situations, the replication manager 162 can configure the replication splitter 226 on a per-data-object basis.

The object-volume mapping layer 224 maps internal volumes to respective data objects, such as LUNs, host file systems, and VVols. Mapping underlying volumes to host-accessible LUNs may involve a remapping operation from a format compatible with the internal volume to a format compatible with the LUN. In some examples, no remapping is needed. Mapping internal volumes to host file systems, however, may be accomplished by leveraging the fact that file systems are customarily built upon volumes, such that an underlying volume is part of the structure of a host file system. Host file systems, also called “upper-deck file systems,” are thus built upon the internal volumes presented by the volume-file mapping 228 to provide hosts with access to files and directories. Mapping of VVols can be achieved in similar ways. For block-based VVols, the object-volume mapping layer 224 may perform mapping substantially as it does for LUNs. File-based VVols may be mapped, for example, by converting host-specified offsets into VVol files to corresponding offsets into internal volumes.

The protocol end points 220 expose the underlying data objects to hosts in accordance with respective protocols for accessing the data objects. Thus, the protocol end points 220 may expose block-based objects (e.g., LUNs and block-based VVols) using Fibre Channel or iSCSI and may expose file-based objects (e.g., host file systems and file-based VVols) using NFS, CIFS, or SMB 3.0, for example.

In example operation, the IO stack 140 receives an IO request 112 specifying data to be written to a particular data object. The object-volume mapping 224 maps the IO request 112 to a block-based request 112a directed to an internal volume. The replication splitter 226 may intercept the block-based request 112a and send the block-based request 112a to the replication appliance 160 (or may pass through the IO request, depending on settings established by the replication manager 162 for the data object). Assuming the replication splitter 226 intercepts the block-based request 112a, the replication appliance 160 coordinates with other components to replicate the data specified in the block-based request 112a at a second site and provides the replication splitter 226 with an acknowledgement. When the replication splitter 226 receives the acknowledgement, the replication splitter 226 allows the block-based request 112a to continue propagating down the IO stack 140. The volume-file mapping 228 maps the block-based request 112a to one that is directed to a particular file of a lower-deck file system, and the back end 144 and storage 180 process the IO request by writing the specified data to actual media. In this manner, the IO stack 140 supports both local storage of the data specified in the IO request 112 and replication at a second site.
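The source-mode behavior described above, in which a write is forwarded to the appliance and held until acknowledged before continuing down the stack, may be sketched as follows. The classes, the dictionary form of a request, and the method names are all hypothetical and chosen only for illustration; an actual splitter would operate on block-based requests within the IO stack, and the appliance would coordinate with a second site rather than record requests locally.

```python
class ReplicationAppliance:
    """Stand-in for the appliance; records mirrored requests."""

    def __init__(self):
        self.mirrored = []

    def replicate(self, request):
        # A real appliance would replicate the data at a second site
        # before acknowledging; here the request is simply recorded.
        self.mirrored.append(request)
        return True  # acknowledgement


class ReplicationSplitter:
    """Hypothetical per-data-object splitter operating in source mode."""

    def __init__(self, appliance, lower_layer):
        self.appliance = appliance
        self.lower_layer = lower_layer    # callable for the next stack layer
        self.replicated_objects = set()   # per-data-object settings

    def enable_replication(self, object_id):
        self.replicated_objects.add(object_id)

    def handle(self, request):
        intercept = (request["op"] == "write"
                     and request["object_id"] in self.replicated_objects)
        if intercept:
            # Hold the request until the appliance acknowledges it.
            if not self.appliance.replicate(request):
                raise IOError("replication was not acknowledged")
        # Pass-through, or continue after acknowledgement.
        return self.lower_layer(request)
```

In this sketch, reads and writes to objects without replication settings pass directly to the lower layer, corresponding to the pass-through mode described above.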

The replication splitter 226 may operate in both a source mode (described above) and in a destination mode. In destination mode, the replication splitter 226 receives mirrored IO requests arriving from another data storage system via the replication appliance 160. Lower levels of the IO stack 140 then process the mirrored IO requests to effect data writes to a local replica.

FIG. 3 shows portions of the front end 142 in additional detail. Here, data objects include a LUN 310, an HFS (host file system) 312, and a VVol 314. The object-volume mapping 224 includes a LUN-to-Volume mapping 320, an HFS-to-Volume mapping 322, and a VVol-to-Volume mapping 324. Using the approach described above, the LUN-to-Volume mapping 320 maps the LUN 310 to a first volume 324, the HFS-to-Volume mapping 322 maps the HFS 312 to a second volume 326, and the VVol-to-Volume mapping 324 maps the VVol 314 to a third volume 328. The replication splitter 226 may intercept IOs in accordance with settings established by the replication manager 162 (as described above). The Volume-to-File mapping 228 maps the first, second, and third internal volumes 324, 326, and 328 to respective files 336 (F1), 346 (F2), and 356 (F3) in respective lower-deck file systems 330, 340, and 350. Through the various mappings, any set of blocks of the LUN 310 specified in an IO request 112 is mapped to a corresponding set of blocks within the first volume 324 and within the first file 336. Similarly, any file or directory of the HFS 312 specified in an IO request 112 is mapped to a corresponding set of blocks within the second volume 326 and within the second file 346. Likewise, any portion of the VVol 314 specified in an IO request 112 is mapped to a corresponding set of blocks within the third volume 328 and within the third file 356.

The lower-deck file systems 330, 340, and 350 each include a respective inode table, 332, 342, and 352. Inodes 334, 344, and 354 provide file-specific information about the first file 336, the second file 346, and the third file 356, respectively. The information stored in each inode includes location information (e.g., block locations) where data of the respective file are stored.

Although a single file is shown for each of the lower-deck file systems 330, 340, and 350, it is understood that each of the lower-deck file systems 330, 340, and 350 may include any number of files, with each file having its own entry in the respective inode table. In one example, each lower-deck file system stores not only the file F1, F2, or F3, but also snaps of those files, and therefore snaps of the data objects realized by the files. Lower-deck file systems may also include auxiliary files (not shown), such as hybrid log files, which may accompany upper-deck file systems, such as HFS 312. Although FIG. 3 shows only one host file system (HFS 312), it is understood that any number of host file systems may be provided, and that such host file systems may be grouped together in one or more VSPs.

As shown, the storage pool 232 provisions slices 360 to the file systems 330, 340, and 350. Here, slices S1-S3 provide storage for lower-deck file system 330, slices S4-S7 provide storage for lower-deck file system 340, and slices S8 and S9 provide storage for lower-deck file system 350.

Because the files F1, F2, and F3 each store entire data objects, including their metadata, the data stored in these files may include both non-metadata and metadata. For example, file F2 stores an entire host file system, including its file data (non-metadata) as well as its inodes, indirect blocks, per-block metadata, and so forth.

FIG. 4 shows an example set of records 400 of the configuration database 170 (FIG. 1), which provide definitions for VSPs 1-N. For each VSP, a record specifies an owning SP, authentication information, and identifiers of data objects (e.g., file systems, but in some cases also LUNs and/or VVols) associated with the respective VSP, including identifiers of internal file systems (e.g., a root file system and a configuration file system) and various user file systems or other data objects. The record may further specify various host interfaces that define host IO protocols that the respective VSP is equipped to handle. The record for each VSP thus identifies not only data objects associated with the VSP, but also a set of interfaces and settings that form a “personality.” This personality enables the VSP to interact with hosts in a manner similar to the way a physical storage processor interacts with hosts. When operated, VSPs are instantiated on the owning SP by starting their respective host interfaces. The interfaces for each VSP can respond to host IO requests for reading and writing the data objects of the respective VSP, which are stored in the storage 180.

II) Continuous Replication on Block-Based and File-Based Objects:

Various arrangements for performing continuous replication will now be described in connection with FIG. 5. As is known, “continuous” replication provides any-point-in-time recovery and may be performed using synchronous or asynchronous replication technologies. “Synchronous” replication refers to replication performed in band with IO requests as the IO requests are processed. In contrast, “asynchronous” replication is performed out of band with individual IO requests, with replicas generated, for example, on demand, at regular intervals, and/or in response to particular events.

FIG. 5 shows an example arrangement for performing continuous replication on a VSP (VSP1) stored on a first data storage system 116 (i.e., the data storage system 116 of FIG. 1) to replicate the VSP to a second data storage system 516. Here, the first data storage system 116 is located at a first site 510 (i.e., at a “source”) and the second data storage system 516 is located at a second site 520 (i.e., at a “destination”). In an example, the first site 510 and the second site 520 are located at different geographical locations, such as in different rooms or in different buildings of a city or campus, although this is not required. As described in connection with FIG. 1, the first data storage system 116 includes persistent storage 180 and is operatively connected to a first replication appliance 160. Here, the second data storage system 516 includes persistent storage 580 (e.g., disk drives, flash drives, and the like) and is operatively connected to a second replication appliance 560. The second replication appliance 560 includes a journal 560 a, which may be implemented in non-volatile memory (e.g., on disk or flash). In some examples, the journal 560 a is implemented with high-speed, non-volatile memory within the storage 580, e.g., in a LUN. In some examples, the data storage systems 116 and 516 behave symmetrically, with each site acting as a replica site for data objects stored on the other. In the example shown, however, the first site 510 acts to receive and process IO requests 112 from hosts for accessing VSP1, whereas the second site 520 acts to maintain a replica VSP1-r of VSP1. The second data storage system 516 may be configured in a manner similar or identical to the first data storage system 116. For example, although not shown, the second data storage system 516 includes its own communication interface(s) 122, processing units 124, memory 130, IO stack 140, replication manager 162, and GUI application 164. The replication managers 162 and GUI applications 164 may coordinate across the two systems to manage and orchestrate replication, failover, DR testing, and recovery.

It can be seen that VSP1 includes at least three file systems, labeled FSA, FSB, and FSC. Continuous replication maintains a replica VSP1-r of VSP1 at the second data storage system 516. The replica VSP1-r includes replicas of each of VSP1's file systems, i.e., replicas FSA-r, FSB-r, and FSC-r, which are replicas of FSA, FSB, and FSC, respectively.

To replicate VSP1 as a single object, the replication manager 162 (FIG. 1) may assign all of the file systems of VSP1 (e.g., FSA, FSB, and FSC) to the same consistency group. In an example, the replication manager 162 groups together the volume-files (FIG. 3) for VSP1's data objects (e.g., FSA, FSB, and FSC) into a single volume structure (the consistency group) and performs continuous replication consistently across all volume-files in the group.

The encircled numbers in FIG. 5 identify an example sequence of events. In this example, synchronous replication is shown. At (1), the first data storage system 116 receives an IO request 112 specifying data to be written in the storage 180 for a particular file system in VSP1 (e.g., any of FSA, FSB, or FSC). The IO request 112 propagates down the IO stack 140 (FIG. 2) and encounters the replication splitter 226. The replication splitter 226 intercepts the IO request and temporarily prevents the IO request from propagating further down the IO stack 140.

At (2), the replication splitter 226 sends the IO request (e.g., a version thereof) to the first replication appliance 160. The first replication appliance 160 may store the IO request in the journal 160 a.

At (3), the first replication appliance 160 forwards the IO request to the second replication appliance 560. The second replication appliance 560 stores the data specified in the IO request in the journal 560 a.

At (4), the second replication appliance 560 acknowledges safe storage of the data specified in the IO request back to the first replication appliance 160. For example, the second replication appliance 560 acknowledges that the data specified in the IO request have been persisted in the journal 560 a.

At (5), the first replication appliance 160 in turn acknowledges receipt to the replication splitter 226. Only when the replication splitter 226 receives the acknowledgement from the first replication appliance 160 does the replication splitter 226 allow the IO request to continue propagating down the IO stack 140 (FIG. 2). The replication splitter 226 receives the acknowledgement and completes the write operation specified in the IO request to the storage 180, e.g., to blocks 512 to which the IO request was directed.

At (6), the first data storage system 116 acknowledges completion of the IO request 112 back to the originating host.

Asynchronously with the IO request, the second replication appliance 560 may de-stage data from the journal 560 a to the replica 522 of VSP1 maintained in the storage 580. For example, at (7), the data specified in the IO request are transferred from the journal 560 a to the storage 580, e.g., to blocks 522 storing replica data. At (8), the second data storage system 516 acknowledges completion.
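By way of illustration only, the sequence (1) through (8) may be sketched in simplified form as follows. The class names, method names, and in-memory dictionaries are hypothetical stand-ins for the appliances, the journal 560 a, and the block storage, and are not part of the described system; the acknowledgement at (8) is omitted for brevity.

```python
class DestinationAppliance:
    """Hypothetical stand-in for the second replication appliance 560."""

    def __init__(self):
        self.journal = []    # stands in for journal 560 a (records are retained)
        self.destaged = 0    # number of journal entries already de-staged
        self.replica = {}    # stands in for replica blocks 522

    def persist(self, block, value):
        # (3) store the data in the journal; (4) acknowledge safe storage
        self.journal.append((block, value))
        return True

    def destage(self):
        # (7) transfer journaled data to the replica, out of band with the IO
        for block, value in self.journal[self.destaged:]:
            self.replica[block] = value
        self.destaged = len(self.journal)


class SourceSystem:
    """Hypothetical stand-in for the first data storage system 116."""

    def __init__(self, destination):
        self.destination = destination
        self.storage = {}    # stands in for blocks 512

    def write(self, block, value):
        # (1)-(2) the splitter intercepts the write and holds it
        acked = self.destination.persist(block, value)    # (3)-(5)
        if not acked:
            raise IOError("destination did not acknowledge")
        self.storage[block] = value    # the write is released to storage
        return "ok"                    # (6) acknowledge the host
```

In this sketch the host write completes only after the destination has journaled the data, while the replica itself is updated later, when `destage` runs.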

The arrangement shown in FIG. 5 may be used with slight modification to perform asynchronous replication. For instance, the first replication appliance 160 may be configured to provide acknowledgements as soon as it persists the specified data locally, e.g., in the journal 160 a. The first replication appliance 160 accumulates data from the replication splitter 226 and sends the data to the second site 520 on a regular basis and/or upon the occurrence of specified events, e.g., in accordance with settings prescribed for VSP1 in the replication manager 162. Thus, the arrangement of FIG. 5 supports both synchronous and asynchronous continuous replication. Additional information about synchronous and asynchronous continuous replication of VSPs and the objects they contain may be found in U.S. patent application Ser. No. 14/041,204, filed Sep. 13, 2013, the contents and teachings of which are incorporated by reference as if set forth explicitly herein.

III) Example Improvements for Performing Point-In-Time Recovery of File System and VSP Replicas:

Techniques will now be described in connection with FIGS. 6-11 for performing point-in-time recovery of file systems and VSPs. These techniques may be performed, for example, in the environment 100 of Section I using the continuous replication processes described in Section II.

FIG. 6 shows an example realization of portions of the journal 560 a. The journal 560 a organizes information to assist with continuous replication and to enable point-in-time recovery. In this example, the journal 560 a organizes information in records by record ID (RID), shown as rows, where each record corresponds to a single replication operation. Each replication operation may include one or more IO requests 112 received by the first data storage system 116 and mirrored to the second data storage system 516. FIG. 6 shows several records in reverse chronological order, ranging from a most recent record 612 to an older record 616. In an example, the journal 560 a persists a very large number of records, corresponding to a very large number of replication operations. In some examples, the RID is an auto-incrementing number. In other examples, a timestamp or some other unique value may serve as the RID.

As shown, the journal 560 a stores, for each record listed, a timestamp, an identifier of the consistency group (CGID) to which the respective replication operation is directed, and a set of changes <Deltas> applied to the replica of the respective consistency group, e.g., performed on the consistency group at the source and mirrored to the destination. For example, these changes include a list of block locations and associated values to be applied to the identified consistency group by the respective replication operation. If the consistency group represents a single file system (e.g., FSA), then the set of changes indicates changes made to the volume-file for that file system. If the consistency group represents a VSP, then the set of changes identifies changes made to any of the volume-files grouped together by the VSP. In all cases, the changes (deltas) provide data for mirroring changes made to a data object in the first data storage system 116 to a replica in the second data storage system 516.

The journal 560 a can further be seen to include, for each RID, undo information <Undo> and redo information <Redo>. The undo information for a given replication operation includes changes (e.g., block locations, modifications, etc.) required to reverse, or “undo,” any changes (Deltas) made to a consistency group as a result of having performed that replication operation. For example, the undo information may include block locations and values of a replica where changes (deltas) were applied. Thus, applying the undo information for a particular replication operation has the effect of nullifying the changes (deltas) made by applying that replication operation to the consistency group and thus of restoring the consistency group to its previous state. The redo information for a particular replication operation has the effect of reversing the effect of having applied the undo information. In some examples, the redo information for a particular replication operation is similar or identical to the deltas.
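As a minimal sketch of one way such a record might be represented, the following assumes that deltas, undo information, and redo information are each a mapping from block location to value. The `JournalRecord` type and the helper functions are hypothetical names introduced for illustration and are not taken from the described system.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class JournalRecord:
    """Hypothetical layout of one journal record (one replication operation)."""
    rid: int                    # record ID
    timestamp: float
    cgid: str                   # consistency group identifier
    deltas: Dict[int, bytes]    # block location -> new value
    undo: Dict[int, Optional[bytes]] = field(default_factory=dict)
    redo: Dict[int, bytes] = field(default_factory=dict)


def apply_record(replica, record):
    """Apply the deltas to the replica, capturing prior values as undo info."""
    for block, value in record.deltas.items():
        # None records that the block held no data before this operation.
        record.undo[block] = replica.get(block)
        replica[block] = value
    # In this sketch the redo information is simply a copy of the deltas.
    record.redo = dict(record.deltas)


def undo_record(replica, record):
    """Nullify the deltas by restoring the values saved as undo info."""
    for block, old in record.undo.items():
        if old is None:
            replica.pop(block, None)
        else:
            replica[block] = old
```

Applying `undo_record` after `apply_record` returns the replica to the state it had before the replication operation, which is the property relied upon for rollback.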

In some examples, the journal 560 a may associate any of the records with a respective “marker.” For instance, an administrator or other user of the first data storage system 116 may insert a marker into an ongoing replication session, e.g., by operating the GUI application 164, to mark a particular point in time. Alternatively, an application running on a host may insert a marker automatically. In either case, the replication manager 162 applies that marker to the next replication operation, e.g., as metadata with the next mirrored IO request, such that the marker travels from the first data storage system 116 to the second data storage system 516 at a known point in time. In the example of FIG. 6, a marker named “App Con” has been inserted as metadata and recorded in the journal 560 a with record 614.

To perform point-in-time recovery for a particular data object, an administrator may operate the GUI application 164 to view selected content of the journal 560 a and to select a point in time to which to roll back. For example, the GUI application 164 may receive input from the administrator and generate, in response to the input, a rollback request 620. Here, the rollback request 620 identifies, based on the administrator's selection, the record 614, which corresponds to a previous point in time, shown as “T.” It should be understood, though, that the rollback request 620 may specify any point in time, i.e., any of the records for that data object listed in the journal 560 a. In some examples, rollback granularity may be provided down to the level of individual IO requests.

In response to receiving the rollback request 620, the replication manager 162 orchestrates recovery of the replica of the selected data object to the designated point in time. For example, the replication manager 162 directs recovery activities to apply changes specified in the undo information 630 for the data object that have accrued since the time T. In an example, the recovery activities apply undo information to the selected data object in reverse-chronological order, undoing the most recent change first and continuing in order until all changes have been undone back to the time T. Although the journal 560 a is shown to include records for multiple objects (CGIDs), it should be understood that undo information is applied only for the selected data object, i.e., the data object that the administrator has chosen to roll back.
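The reverse-chronological application of undo information may be sketched as follows. This is an illustrative example only; it assumes each journal record is a dictionary with hypothetical keys `timestamp`, `cgid`, and `undo` (block location mapped to the prior value, with `None` meaning the block previously held no data).

```python
def roll_back(replica, journal, cgid, target_time):
    """Undo, most recent first, every record for `cgid` newer than target_time.

    Returns the number of records undone. Records belonging to other
    consistency groups are left untouched.
    """
    accrued = [r for r in journal
               if r["cgid"] == cgid and r["timestamp"] > target_time]
    for rec in sorted(accrued, key=lambda r: r["timestamp"], reverse=True):
        for block, old in rec["undo"].items():
            if old is None:
                replica.pop(block, None)
            else:
                replica[block] = old
    return len(accrued)


# Usage: block 0 was written three times for FSA; roll back to time 1.
journal = [
    {"timestamp": 1, "cgid": "FSA", "undo": {0: None}},
    {"timestamp": 2, "cgid": "FSB", "undo": {0: None}},
    {"timestamp": 3, "cgid": "FSA", "undo": {0: b"a"}},
    {"timestamp": 4, "cgid": "FSA", "undo": {0: b"b", 1: None}},
]
replica = {0: b"c", 1: b"x"}
undone = roll_back(replica, journal, "FSA", target_time=1)
```

The ordering matters in this example: undoing the record at time 3 before the record at time 4 would leave block 0 holding the intermediate value rather than the value it held at time 1.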

Given this framework, it is clear that the administrator may also roll forward in time, e.g., by providing input to the GUI application 164, to select a more recent point in time, including the most recent point in time. To roll forward, replication activities apply redo information for the designated data object to apply changes, e.g., in forward-chronological order, beginning from the currently selected point in time and proceeding, in order, to the newly selected point in time.
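Rolling forward may be sketched in the same illustrative style. The example below assumes journal records are dictionaries with hypothetical keys `timestamp`, `cgid`, and `redo` (block location mapped to the value to re-apply); these names are assumptions made for the sketch.

```python
def roll_forward(replica, journal, cgid, current_time, new_time):
    """Re-apply, oldest first, records for `cgid` in (current_time, new_time].

    Returns the number of records redone.
    """
    pending = [r for r in journal
               if r["cgid"] == cgid
               and current_time < r["timestamp"] <= new_time]
    for rec in sorted(pending, key=lambda r: r["timestamp"]):
        replica.update(rec["redo"])
    return len(pending)


# Usage: the replica sits at time 1 and is rolled forward to time 4.
journal = [
    {"timestamp": 3, "cgid": "FSA", "redo": {0: b"b"}},
    {"timestamp": 4, "cgid": "FSA", "redo": {0: b"c", 1: b"x"}},
    {"timestamp": 5, "cgid": "FSA", "redo": {0: b"d"}},
]
replica = {0: b"a"}
redone = roll_forward(replica, journal, "FSA", current_time=1, new_time=4)
```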

In some examples, an application running on one or more of the hosts 110(1-N) (FIG. 1) may have activities in flight that must run to completion before the application can assume a consistent and/or recoverable state. For instance, the application may internally queue IO requests and/or may perform processing to form IO requests to complete some activity. If an error occurs in the first data storage system 116 while these activities are in process, the application might not be able to recover easily from the error and proceed. To provide the option to avoid recovering from a replica that reflects an inconsistent state of the application, an administrator may temporarily quiesce the application to allow the application to assume a consistent state, and then manually insert a marker, such as “App Con” (FIG. 6). The administrator may then resume the application. The marker communicates the application-consistent state information to the journal 560 a in the next replication operation. Later, if a file system, VSP, or other object experiences an error such that recovery is required, it might be better to roll back the replica to a point in time when the application was in the consistent state. It should be understood that rollback to an application-consistent point in time is not required, however. For instance, it may be preferable to recover from a more recent point in time, e.g., to avoid data loss, even if doing so comes at the cost of having to repair or reset the application.

FIG. 7 shows example activities that may be performed when restoring a VSP to a previous point in time. The activities described may be applied to file systems individually, as well. In this example, VSP1-r, on the second data storage system 516, provides a replica of VSP1, on the first data storage system 116 (FIG. 5). VSP1-r groups together file systems FSA-r, FSB-r, and FSC-r, which respectively provide replicas of file systems FSA, FSB, and FSC. In this example, each of the file system replicas FSA-r, FSB-r, and FSC-r has an associated transaction log, identified here as LOGa-r, LOGb-r, and LOGc-r, respectively. These transaction logs LOGa-r, LOGb-r, and LOGc-r themselves are replicas of transaction logs provided for file systems FSA, FSB, and FSC, respectively, in the first data storage system 116. Each transaction log stores pending metadata transactions that are yet to be applied to its respective file system. The metadata transactions may describe, for example, updates to inodes, indirect blocks, or other metadata structures in the file system. In an example, each transaction log is replicated along with its respective file system. For instance, a transaction log may be implemented in a separate file (e.g., an auxiliary file), and the separate file may be assigned to a consistency group along with the volume-file that realizes the file system to which the transaction log belongs (FIG. 3). Alternatively, the transaction log may itself be embedded within the file system to which it belongs, e.g., in a dedicated file system subspace, such that replicating the file system inherently replicates the transaction log. Other types of transaction logs may be used; these are merely examples. In any case, restoring a file system to a particular point in time may involve also restoring the transaction log for that file system to the same point in time. When restoring a VSP to a previous point in time, the file systems grouped by the VSP are all restored to the same point in time, along with their respective transaction logs.

When restoring a file system replica to a previous point in time, the file system replica may be left in an incomplete state, which reflects an incomplete state of the source file system. The state of the file system and the replica may be incomplete because pending metadata transactions from the transaction log have not yet been applied to the file system. Thus, when rolling back a file system to a previous point in time, restore activities may include applying the pending transactions from the transaction log to the metadata structures in the file system.

As shown in FIG. 7, restore activities 710 apply pending log transactions from LOGa-r to FSA-r, e.g., by writing the pending transactions to the metadata structures in FSA-r. Similarly, restore activities 720 and 730 apply pending log transactions to file systems FSB-r and FSC-r, respectively. At the conclusion of operations 710, 720, and 730, the file systems grouped by VSP1-r are each in a complete, consistent state and are available for user access.
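A minimal sketch of applying pending log transactions to a rolled-back replica follows. It assumes, purely for illustration, that each pending transaction is a tuple naming a metadata structure, a key within it, and the value to write; this transaction layout and the function name `replay_log` are hypothetical and are not specified by the description above.

```python
from collections import deque


def replay_log(fs_metadata, transaction_log):
    """Apply pending metadata transactions in order and drain the log.

    fs_metadata: dict of metadata structures, e.g. {"inodes": {...}}
    transaction_log: deque of (structure, key, value) tuples
    Returns the number of transactions applied.
    """
    applied = 0
    while transaction_log:
        structure, key, value = transaction_log.popleft()
        fs_metadata.setdefault(structure, {})[key] = value
        applied += 1
    return applied


# Usage: bring a rolled-back replica (standing in for FSA-r) to a complete state.
fsa_r = {"inodes": {12: "size=4096"}}
log_a_r = deque([
    ("inodes", 12, "size=8192"),
    ("indirect_blocks", 40, "ptr=1234"),
])
count = replay_log(fsa_r, log_a_r)
```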

It should be understood that applying log transactions to file systems involves making changes to the file systems. In some examples, such changes are provided in the form of IO requests that are processed by the IO stack 140 of the second data storage system 516. A replication splitter 226, within the IO stack of the second data storage system 516, may intercept each of the IO requests en route to the storage 580 and forward the IO request to the journal 560 a. The journal 560 a may then record data specified by the IO requests in applying transactions from a transaction log, with such data forming one or more new records in the journal (new deltas). Associated undo and redo information may be provided, such that writes from the log may be undone or redone as desired.

FIGS. 8A and 8B show example operations when performing DR (Disaster Recovery) testing on a VSP. The operations described may be applied to file systems individually, as well.

In FIG. 8A, it is assumed that VSP1-r has been restored to a previous point in time, e.g., in response to a rollback request 620 in the manner described in connection with FIGS. 6 and 7, and that an administrator or other user has entered a DR testing mode. Here, the second data storage system 516 receives IO requests 112, e.g., from the administrator, to exercise VSP1-r, e.g., to perform reads and writes, in an effort to ascertain whether VSP1-r could take over for VSP1 in the event of a failure at the first data storage system 116.

In contrast with previous approaches to DR testing, which involve taking snaps of a volume and then reading and writing the snaps to assess the state of the replicated object, DR testing in this example is performed directly on the rolled-back replica, rather than on a snap. It is believed that performing DR testing on the rolled-back replica itself provides more accurate DR testing results, as one is exercising the very same object to which failover would occur through the very same data path, e.g., not through other metadata structures as would be the case with a snap.

In an example, the administrator issues an IO request 112 w specifying data to be written to a set of blocks of FSA-r (one block shown). Prior to processing the IO request 112 w, the value of the addressed block is 810 a. After processing the IO request 112 w, the value of the same block will be 810 b. When processing the IO request 112 w in the IO stack 140, the replication splitter 226 intercepts the IO request 112 w, reads the current value of the addressed block from FSA-r, and stores the current value 810 a in a new record in the journal 560 a, i.e., as undo information. Once the data 810 a are persisted in the journal 560 a as undo information, the replication splitter 226 may allow the IO request 112 w to write the data 810 b to the addressed block of FSA-r (FIG. 5). DR testing may proceed in this fashion, with every write to any object in VSP1-r causing the journal 560 a to preserve the change in a new record along with associated undo information.

In FIG. 8B, a command 830 is received to exit DR testing mode. In response to the command 830, the replication manager 162 orchestrates actions to restore VSP1-r to a current state, i.e., a state that reflects the current or nearly current state of VSP1. Alternatively, activities may restore VSP1-r to some other point in time. Here, restore activities access the journal 560 a and perform undo operations to reverse the changes applied to VSP1-r during DR testing. Restore activities include changing the value of the illustrated block from 810 b back to 810 a, as well as making similar changes to other blocks changed during DR testing. Restore activities may further include undoing changes in the file systems FSA-r, FSB-r, and FSC-r that resulted from applying log transactions.
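The capture of undo information during DR testing and its use on exit may be sketched as follows. The `DrTestSession` class is a hypothetical illustration in which a list of (block, prior value) pairs stands in for the new journal records; it covers only the undoing of test writes and does not model the subsequent restoration of the replica to its current state.

```python
class DrTestSession:
    """Hypothetical sketch of DR testing directly on a rolled-back replica."""

    def __init__(self, replica):
        self.replica = replica
        self.undo_records = []    # stands in for new records in journal 560 a

    def write(self, block, value):
        # Read and persist the current value as undo information first,
        # and only then allow the write to reach the replica.
        self.undo_records.append((block, self.replica.get(block)))
        self.replica[block] = value

    def exit(self):
        # Undo the test writes, most recent first.
        while self.undo_records:
            block, old = self.undo_records.pop()
            if old is None:
                self.replica.pop(block, None)
            else:
                self.replica[block] = old


# Usage: the illustrated block changes from 810a to 810b and back again.
fsa_r = {5: "810a"}
session = DrTestSession(fsa_r)
session.write(5, "810b")
session.write(6, "scratch")
during_test = dict(fsa_r)
session.exit()
```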

FIGS. 9 and 10 show example screenshots generated by the GUI application 164. In an example, the GUI application 164 provides a single tool for managing data objects throughout their lifecycles. Thus, for example, an administrator may operate the GUI application 164 to create or destroy data objects, to establish replication, failover, and recovery settings, and to perform DR testing. It should be understood that the particular controls and features shown in FIGS. 9 and 10 may be implemented in many alternative ways, and that the examples shown are intended to be merely illustrative.

For instance, as shown in FIG. 9, an example screen 900 has a control 910 for selecting a data object. When the user clicks on the indicated arrow, the GUI application 164 displays a list 920 of data object types. If the user then clicks one of the displayed object types, such as VSPs 922, the GUI application 164 displays a list 930 of particular VSPs. The administrator may select one of the VSPs. For example, the administrator may select VSP1 and operate control 950 to select a replication type 960 (e.g., Sync or Async). The administrator may then click a setup button 980. It should be understood that the administrator may alternatively select a file system or some other object type to configure. The operations described herein may work in similar ways, regardless of the type of object selected.

FIG. 10 shows an example screen 1000, which the GUI application 164 may display in response to the setup button 980 being clicked (FIG. 9). Here, the administrator may specify settings for VSP1. These may include settings for replication, failover, and recovery, as shown in buttons to the left (clicking them may open new screens). They also include a button 1040 to start a DR test, a button 1050 to exit a DR test, and a slider 1010 to roll VSP1 back in time. For example, the administrator may move bar 1020 left or right to identify a desired point in time to which to roll back VSP1. Moving the bar 1020 causes an internal cursor to move relative to the journal 560 a (FIG. 5), such that moving the bar 1020 to the left moves an internal cursor down, to older records, while moving the bar to the right moves the internal cursor up, to more recent records. Marked locations 1030 a, 1030 b, and 1030 c indicate application-consistent markers, such as the “App Con” marker shown in FIG. 6. The user thus has the option to roll back to an application-consistent point in time if the user so chooses. With the bar 1020 set to the desired point in time, the user may click the button 1036 to roll back, whereupon the GUI application 164 directs the replication manager 162 to orchestrate activities to roll back VSP1-r to the identified point in time, i.e., in the manner described in connection with FIGS. 6 and 7. If the user wishes, the user may then click button 1040 to start DR testing on VSP1-r, e.g., in the manner described in connection with FIG. 8A. When the user is finished, the user may click the button 1050 to exit DR testing. In response, the GUI application 164 directs the replication manager 162 to orchestrate activities to undo changes made to VSP1-r during the DR testing, e.g., in the manner described in connection with FIG. 8B. These changes may include undoing log transactions applied to the file systems FSA-r, FSB-r, and/or FSC-r.
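One possible way to relate a point in time selected with the bar 1020 to a journal record is sketched below. It assumes, as an illustration only, that record timestamps are held in ascending order and that the cursor should land on the most recent record at or before the selected time; the function name and this selection policy are hypothetical.

```python
import bisect


def cursor_for_time(timestamps, selected_time):
    """Return the index of the most recent record at or before selected_time.

    timestamps must be sorted in ascending order. Returns None when the
    selected time precedes every record in the journal.
    """
    i = bisect.bisect_right(timestamps, selected_time) - 1
    return i if i >= 0 else None


# Usage: three records; a selection between the second and third records
# places the cursor on the second record.
record_times = [10, 20, 30]
cursor = cursor_for_time(record_times, 25)
```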

FIG. 11 shows an example process 1100 for managing file system replicas in a data storage system and provides a summary of some of the operations described above. The process 1100 may be carried out, for example, by the second data storage system 516, which may act in cooperation with the first data storage system 116 to perform continuous replication on a file system or VSP. The second data storage system 516 may perform acts of the process 1100 by the set of processing units 124 executing instructions in the memory 130 of the second data storage system.

At 1110, continuous replication is performed to maintain a replica of a file system. The continuous replication (i) specifies changes to be made to the file system and mirrored to the replica, (ii) persists the changes and associated timestamps in a journal, and (iii) applies the changes persisted in the journal to the replica. For example, the second data storage system 516 performs continuous replication, in coordination with activities at the first data storage system 116, to maintain a replica (e.g., FSA-r) of a file system (e.g., FSA). The continuous replication provides IO requests 112 specifying data to be written to FSA to the replica, FSA-r, persists the data specified in the IO requests in a journal 560 a (e.g., in “deltas”) with associated timestamps (FIG. 6), and applies the changes from the journal 560 a to the replica FSA-r (FIG. 5).

At 1112, a request is received to roll back the replica of the file system to a previous point in time. For example, an administrator or other user may adjust the bar 1020 on slider 1010 (FIG. 10) to identify a desired previous point in time and may click the button 1036 to initiate rollback.

At 1114, in response to receiving the request, (i) the journal is accessed to identify, based on the timestamps, a set of the changes made to the replica since the previous point in time and (ii) the set of the changes in the replica is undone to restore the replica to the previous point in time. For example, clicking the button 1036 initiates a sequence of activities, as described in connection with FIGS. 6 and 7, which identify records in the journal 560 a that have accrued since the indicated point in time and perform undo operations to undo the changes and restore the replica to the indicated point in time.

These activities may be performed on a single file system, on multiple file systems, or on a VSP. The VSP may group together multiple file systems and/or other data objects. Thus, the benefits of continuous replication and point-in-time recovery are extended to include file systems, and the functionality for file systems is extended to include VSPs. The improved technique thus provides flexible recovery options for file systems and VSPs and provides an effective vehicle for performing DR testing on the actual object or objects that may be relied upon in the event of failover.

Having described certain embodiments, numerous alternative embodiments or variations can be made. For example, although continuous replication is shown and described between a first data storage system 116 and a second data storage system 516, continuous replication may also be performed by a single data storage system, e.g., for providing a local target from which to perform recovery.

Also, although embodiments have been described for performing continuous replication with the aid of replication appliances 160 and 560 and replication splitters 226, this is merely an example, as the improvements hereof may be realized with any continuous replication technology.

Further, although features are shown and described with reference to particular embodiments hereof, such features may be included and hereby are included in any of the disclosed embodiments and their variants. Thus, it is understood that features disclosed in connection with any embodiment are included as variants of any other embodiment.

Further still, the improvement or portions thereof may be embodied as a computer program product including one or more non-transient, computer-readable storage media, such as a magnetic disk, magnetic tape, compact disk, DVD, optical disk, flash drive, SD (Secure Digital) chip or device, Application Specific Integrated Circuit (ASIC), Field Programmable Gate Array (FPGA), and/or the like (shown by way of example as medium 1150 in FIG. 11). Any number of computer-readable media may be used. The media may be encoded with instructions which, when executed on one or more computers or other processors, perform the process or processes described herein. Such media may be considered articles of manufacture or machines, and may be transportable from one machine to another.

As used throughout this document, the words “comprising,” “including,” “containing,” and “having” are intended to set forth certain items, steps, elements, or aspects of something in an open-ended fashion. Also, as used herein and unless a specific statement is made to the contrary, the word “set” means one or more of something. This is the case regardless of whether the phrase “set of” is followed by a singular or plural object and regardless of whether it is conjugated with a singular or plural verb. Further, although ordinal expressions, such as “first,” “second,” “third,” and so on, may be used as adjectives herein, such ordinal expressions are used for identification purposes and, unless specifically indicated, are not intended to imply any ordering or sequence. Thus, for example, a second event may take place before or after a first event, or even if no first event ever occurs. In addition, an identification herein of a particular element, feature, or act as being a “first” such element, feature, or act should not be construed as requiring that there must also be a “second” or other such element, feature or act. Rather, the “first” item may be the only one. Although certain embodiments are disclosed herein, it is understood that these are provided by way of example only and that the invention is not limited to these particular embodiments.

Those skilled in the art will therefore understand that various changes in form and detail may be made to the embodiments disclosed herein without departing from the scope of the invention.

What is claimed is:
 1. A method of managing file system replicas in a data storage system, the method comprising: performing continuous replication to maintain a replica of a file system, the continuous replication (i) specifying changes to be made to the file system and mirrored to the replica, (ii) persisting the changes and associated timestamps in a journal, and (iii) applying the changes persisted in the journal to the replica; receiving a request to roll back the replica of the file system to a previous point in time; and in response to receiving the request, (i) accessing the journal to identify, based on the timestamps, a set of the changes made to the replica since the previous point in time and (ii) undoing the set of the changes in the replica to restore the replica to the previous point in time, wherein the method further comprises: realizing the replica of the file system in a container file stored in a container file system of the data storage system; and exposing the container file as a volume of the data storage system, wherein, when applying the changes persisted in the journal to the replica, the method includes applying the changes persisted in the journal to the volume, and wherein, when performing continuous replication to maintain the replica of the file system, the method further comprises: performing continuous replication on a transaction log of the file system to maintain a log replica, the log replica storing pending metadata transactions to the file system; and after restoring the replica of the file system to the previous point in time, applying the pending metadata transactions from the log replica to the replica of the file system, wherein the file system is one of multiple file systems grouped together in a VSP (Virtualized Storage Processor), wherein, when performing continuous replication to maintain the replica of the file system, the method comprises performing continuous replication on each of the multiple file systems to maintain a replica of the VSP, and wherein the method further comprises managing multiple lifecycle events of the VSP, including DR testing, from a single management application.
 2. The method of claim 1, further comprising, after restoring the replica of the file system to the previous point in time, processing IO requests directed to the replica of the file system to effect read and write operations on the replica as part of performing DR (Disaster Recovery) testing.
 3. The method of claim 2, wherein processing the IO requests includes processing an IO request to effect a write operation that overwrites a set of blocks in the replica of the file system, and wherein, to effect the write operation, the method further includes, prior to overwriting the set of blocks, providing data from the set of blocks in the journal to preserve values of the set of blocks in the journal.
 4. The method of claim 3, wherein the method further comprises: receiving a request to exit DR testing; in response to receiving the request to exit DR testing, copying the data provided in the journal from the set of blocks back to the set of blocks to restore the set of blocks to their state prior to processing the IO request.
 5. The method of claim 3, wherein the file system is one of multiple data objects grouped together in a VSP (Virtualized Storage Processor), wherein, when performing continuous replication to maintain the replica of the file system, the method comprises performing continuous replication on each of the multiple data objects to maintain a replica of the VSP, and wherein performing DR testing on the replica of the file system is part of a process for performing DR testing on the VSP.
 6. The method of claim 5, wherein the multiple data objects grouped together in the VSP include the file system as well as a set of other objects, the set of other objects including at least one of (i) another file system, (ii) a LUN (Logical Unit Number), or (iii) a VVol (Virtual Volume), and wherein the replica of the VSP includes a replica of each of the set of other objects.
 7. The method of claim 1, wherein performing continuous replication includes performing discrete update operations on the replica of the VSP to keep the replica of the VSP current with changes made to the VSP, and wherein the method further comprises: receiving a message in one of the discrete update operations that identifies a point in time at which an application accessing the VSP is in an application-consistent state, wherein the request to roll back the replica to the previous point in time is a request to roll back the replica to the point in time at which the application accessing the VSP was in the application-consistent state.
 8. A data storage system comprising control circuitry constructed and arranged to: perform continuous replication to maintain a replica of a file system, the continuous replication (i) specifying changes to be made to the file system and mirrored to the replica, (ii) persisting the changes and associated timestamps in a journal, and (iii) applying the changes persisted in the journal to the replica; receive a request to roll back the replica of the file system to a previous point in time; and in response to receiving the request, (i) access the journal to identify, based on the timestamps, a set of the changes made to the replica since the previous point in time and (ii) undo the set of the changes in the replica to restore the replica to the previous point in time, wherein the control circuitry, constructed and arranged to perform continuous replication to maintain the replica of the file system, is further constructed and arranged to: perform continuous replication on a transaction log of the file system to maintain a log replica, the log replica storing pending metadata transactions to the file system; and after restoring the replica of the file system to the previous point in time, apply the pending metadata transactions from the log replica to the replica of the file system, wherein the file system is one of multiple file systems grouped together in a VSP (Virtualized Storage Processor), wherein, when constructed and arranged to perform continuous replication to maintain the replica of the file system, the control circuitry is further constructed and arranged to perform continuous replication on each of the multiple file systems to maintain a replica of the VSP, and wherein the control circuitry is further constructed and arranged to manage multiple lifecycle events of the VSP, including DR testing, from a single management application.
 9. The data storage system of claim 8, wherein the control circuitry is further constructed and arranged to: realize the replica of the file system in a container file stored in a container file system of the data storage system; and expose the container file as a volume of the data storage system, wherein, when constructed and arranged to apply the changes persisted in the journal to the replica, the control circuitry is further constructed and arranged to apply the changes persisted in the journal to the volume.
 10. A computer program product including a set of non-transitory, computer-readable media having instructions which, when executed by control circuitry, cause the control circuitry to perform a method of managing replicas, the method comprising: performing continuous replication to maintain a replica of a VSP (Virtualized Storage Processor), the VSP including multiple file systems, the continuous replication (i) specifying changes to be made to the VSP and mirrored to the replica, (ii) persisting the changes and associated timestamps in a journal, and (iii) applying the changes persisted in the journal to the replica; receiving a request to roll back the replica of the VSP, including each of the multiple file systems, to a previous point in time; and in response to receiving the request, (i) accessing the journal to identify, based on the timestamps, a set of the changes made to the replica of the VSP since the previous point in time and (ii) undoing the set of the changes in the replica of the VSP to restore the replica to the previous point in time, wherein, when performing continuous replication to maintain the replica of the file system, the method further comprises: performing continuous replication on a transaction log of the file system to maintain a log replica, the log replica storing pending metadata transactions to the file system; and after restoring the replica of the file system to the previous point in time, applying the pending metadata transactions from the log replica to the replica of the file system, wherein the file system is one of multiple file systems grouped together in a VSP (Virtualized Storage Processor), wherein, when performing continuous replication to maintain the replica of the file system, the method comprises performing continuous replication on each of the multiple file systems to maintain a replica of the VSP, and wherein the method further comprises managing multiple lifecycle events of the VSP, including DR testing, from a single management application.
 11. The computer program product of claim 10, wherein the replica of the VSP includes a file system replica for each of the multiple file systems that the VSP includes, and wherein the method further comprises: realizing the file system replicas in respective container files stored in a set of container file systems of the data storage system; and exposing each container file as a respective volume of the data storage system, wherein, when applying the changes persisted in the journal to the replica, the method includes applying the changes persisted in the journal for each of the multiple file systems to the respective volume.
 12. The computer program product of claim 11, further comprising, after restoring the replica of the VSP to the previous point in time, processing IO requests directed to the replica of the VSP to effect read and write operations on the replica of the VSP as part of performing DR (Disaster Recovery) testing.
 13. The computer program product of claim 12, wherein processing the IO requests includes processing an IO request to effect a write operation that overwrites a set of blocks in the replica of the VSP, and wherein, to effect the write operation, the method further includes, prior to overwriting the set of blocks, providing data of the set of blocks in the journal to preserve values of the set of blocks in the journal.
 14. The computer program product of claim 13, wherein the method further comprises: receiving a request to exit DR testing; in response to receiving the request to exit DR testing, copying the data provided in the journal from the set of blocks back to the set of blocks to restore the set of blocks to their state prior to processing the IO request.
 15. The computer program product of claim 14, wherein the VSP further includes a set of other objects, the set of other objects including at least one of (i) a LUN (Logical Unit Number) or (ii) a VVol (Virtual Volume), wherein the replica of the VSP includes a replica of each of the set of other objects, and wherein the method further comprises: realizing the replica of each of the set of other objects in respective other container files stored in the set of container file systems of the data storage system; and exposing each of the other container files as a respective volume of the data storage system, wherein, when applying the changes persisted in the journal to the replica, the method includes applying the changes persisted in the journal for each of the set of other objects to the respective volume.
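
By way of illustration only, and without limiting the claims, the journal-based rollback and the preserve-then-restore handling of DR test writes recited above can be sketched in a few lines of Python. Everything named here (`JournalEntry`, `Replica`, `replicate`, `roll_back`, `dr_write`, `exit_dr_testing`) is hypothetical and does not appear in the disclosure. The sketch also assumes that each journal entry captures the prior contents of the block at the moment the change is applied, that journal entries arrive in timestamp order, and that rolled-back entries are simply discarded; an actual system may organize its journal differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class JournalEntry:
    timestamp: int                    # time at which the change was persisted
    block: int                        # block address in the replica volume
    new_data: bytes                   # data written by the change
    old_data: Optional[bytes] = None  # prior contents (assumed captured on apply)


@dataclass
class Replica:
    blocks: Dict[int, bytes] = field(default_factory=dict)
    journal: List[JournalEntry] = field(default_factory=list)
    dr_undo: List[Tuple[int, Optional[bytes]]] = field(default_factory=list)

    def replicate(self, timestamp: int, block: int, data: bytes) -> None:
        """Persist a change with its timestamp, then apply it to the replica."""
        entry = JournalEntry(timestamp, block, data)
        self.journal.append(entry)
        entry.old_data = self.blocks.get(block)
        self.blocks[block] = data

    def roll_back(self, point_in_time: int) -> None:
        """Undo, newest first, every change made after point_in_time."""
        while self.journal and self.journal[-1].timestamp > point_in_time:
            entry = self.journal.pop()
            self._restore(entry.block, entry.old_data)

    def dr_write(self, block: int, data: bytes) -> None:
        """DR-test write: preserve the block's current value before overwriting."""
        self.dr_undo.append((block, self.blocks.get(block)))
        self.blocks[block] = data

    def exit_dr_testing(self) -> None:
        """Copy the preserved values back, newest first, discarding test writes."""
        while self.dr_undo:
            block, old = self.dr_undo.pop()
            self._restore(block, old)

    def _restore(self, block: int, old: Optional[bytes]) -> None:
        # A block that did not exist before the change is removed again.
        if old is None:
            self.blocks.pop(block, None)
        else:
            self.blocks[block] = old


replica = Replica()
replica.replicate(10, 0, b"A")
replica.replicate(20, 0, b"B")
replica.replicate(30, 1, b"C")
replica.roll_back(20)      # undoes only the change stamped 30
replica.dr_write(0, b"X")  # test write during DR testing
replica.exit_dr_testing()  # block 0 returns to its rolled-back value
```

Undoing in newest-first order matters when several changes touch the same block: each undo restores the value that the next-older change wrote, so the replica ends at exactly the state it had at the requested point in time.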